This readme file was generated on 2026-02-21 by Kristina Angelevska


GENERAL INFORMATION

Title of Dataset: Violent Crime in the City of Boston during 2023-2024 

Author/Principal Investigator Information
Name: Kristina Angelevska
ORCID: 0009-0005-0530-8949
Institution: Independent Researcher
Email: angelevska.kristina@gmail.com

- Date of data collection: 2025-05-20
- Geographic location of data collection: Boston, Massachusetts, United States of America
- Funding sources: None

SHARING/ACCESS INFORMATION

- Licenses/restrictions placed on the data: None
- Links to publications that cite or use the data:
  - Manuscript Title: Environmental Risk Factors for Violent Crime in Boston: Comparative Analysis of aggravated assaults with and without shootings during 2024 using Risk Terrain Modeling (RTM)
  - Status: Under Review (2026)
  - Note: This crime dataset was used as the outcome event for the analysis of aggravated assaults with and without shootings in the Risk Terrain Modeling Diagnostics Utility (RTMDx), in the manuscript listed above.


- Links to other publicly accessible locations of the data: None
- Links/relationships to ancillary data sets: Yes
   - The 'Violent Crimes dataset 2023-2024.csv' file was used as the outcome event alongside ancillary environmental risk factor layers and a study area shapefile. These ancillary datasets are not included in this repository. All files are used together to build the risk terrain models in the Risk Terrain Modeling Diagnostics Utility (RTMDx).
 
- Was data derived from another source? Yes
  - If yes, list source(s):
    - Boston Police Department, via the City of Boston open-data portal, Analyze Boston: https://data.boston.gov/
- Recommended citation for this dataset:


DATA & FILE OVERVIEW

File List: Violent Crime in the City of Boston between 2023-2024. This file was derived from the original dataset, Crime Incidents 2023-Present, produced by the Boston Police Department and available at City of Boston, Analyze Boston: https://data.boston.gov/

- Relationship between files, if important:
- Additional related data collected that was not included in the current data package:
  - The place-based data (environmental risk factors) and the study boundary shapefile are not included in this data repository.
- Are there multiple versions of the dataset? No.
	


METHODOLOGICAL INFORMATION

Description of methods used for collection/generation of data:
The raw incident data was derived from the Crime Incident Reports 2023-present, available at the open-data portal of the Boston Police Department, City of Boston.

Methods for processing the data:
The final dataset was derived from the raw incident data, which contains records from the new crime incident records system, Mark 43, the records management system (RMS) database used by the Boston Police Department since 2019. The final dataset included in the data repository was created through the following filtering steps:

Filters:

- Records were restricted to three types of offenses: aggravated assault; murder and non-negligent manslaughter; and robbery. These three offenses are used as a combined metric of violent crime in the research study. They were used as a selection criterion to identify incidents that noted an increase within the time frame of January 1, 2023 to December 31, 2024.

- Data exclusions: Incidents of violent crime (aggravated assault, robbery, murder and non-negligent manslaughter) were excluded if:
  - They were missing latitude and longitude values.
  - Their reporting areas were labeled as out of jurisdiction (OOJ) or external.
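
The filtering for this dataset was done in Microsoft Excel. The steps above can also be sketched in Python, as a rough illustration only. The column names (OFFENSE_CODE, YEAR, REPORTING_AREA, Lat, Long) and the exact text of the out-of-jurisdiction and external labels are assumptions and may not match the raw export; only the three offense codes come from the variable list in this readme.

```python
# Offense codes as listed in this readme (murder, robbery, aggravated assault).
VIOLENT_OFFENSE_CODES = {"111", "301", "423"}
# Assumed labels for out-of-jurisdiction and external reporting areas.
EXCLUDED_REPORTING_AREAS = {"OOJ", "EXTERNAL"}
STUDY_YEARS = {"2023", "2024"}


def has_coordinates(row):
    """Return True when both latitude and longitude parse as numbers."""
    try:
        float(row["Lat"])
        float(row["Long"])
    except (KeyError, TypeError, ValueError):
        return False
    return True


def keep_record(row):
    """Apply the offense, time frame, coordinate, and jurisdiction filters."""
    reporting_area = row.get("REPORTING_AREA", "").strip().upper()
    return (
        row.get("OFFENSE_CODE", "").strip() in VIOLENT_OFFENSE_CODES
        and row.get("YEAR", "").strip() in STUDY_YEARS
        and has_coordinates(row)
        and reporting_area not in EXCLUDED_REPORTING_AREAS
    )


def filter_violent_crimes(rows):
    """Return only the rows that pass every filter."""
    return [row for row in rows if keep_record(row)]


# Made-up rows for illustration; they are not taken from the dataset.
sample_rows = [
    {"OFFENSE_CODE": "423", "YEAR": "2024", "REPORTING_AREA": "282", "Lat": "42.33", "Long": "-71.08"},
    {"OFFENSE_CODE": "301", "YEAR": "2023", "REPORTING_AREA": "OOJ", "Lat": "42.35", "Long": "-71.06"},
    {"OFFENSE_CODE": "111", "YEAR": "2023", "REPORTING_AREA": "105", "Lat": "", "Long": ""},
    {"OFFENSE_CODE": "619", "YEAR": "2024", "REPORTING_AREA": "282", "Lat": "42.33", "Long": "-71.08"},
]
kept = filter_violent_crimes(sample_rows)
print(len(kept))
```

In this sketch, only the first sample row passes: the second has an excluded reporting area, the third has no coordinates, and the fourth has an offense code outside the three selected offenses.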


Instrument- or software-specific information needed to interpret the data:
1) Microsoft Excel: used for data cleaning and for creating derived variables with logical formulas.
2) QGIS: software for spatial data processing, used to convert crime incidents originally available in latitude and longitude to a projected coordinate system (PCS) using meters.
3) Risk Terrain Modeling Diagnostics Utility (RTMDx): used to analyze the connection of environmental features (risk factors) to the locations of aggravated assault incidents.


DATA-SPECIFIC INFORMATION FOR: Violent Crime in the City of Boston during 2023-2024

1) Number of variables: 20
2) Number of cases/rows: 3507

3) Variable List:

- Incident Number:
  - A unique identifier assigned by the Boston Police Department to a specific criminal event. This ID is used to distinguish each individual crime incident and to prevent double counting.

- Offense Code:
  - Internal offense codes assigned by the Boston Police Department to each offense category. Aggravated Assault is distinguished by code 423 (Aggravated Assault & Battery), Robbery by code 301, and Murder, Non-negligent manslaughter by code 111.

- Offense Description:
  - The specific incident category/offense as distinguished by the reporting agency. Three offense descriptions are included in the dataset: Aggravated Assault, Robbery, and Murder, Non-negligent manslaughter.

- District
  - The administrative police district where the incident occurred. District codes are alphanumeric and cover the 12 police districts within the geographic and jurisdictional boundary of the Boston Police Department.

- Reporting Area
  - Reporting areas are geographic units assigned by the police agency for incident tracking and statistical reporting. They are included in the dataset in numeric format. A police district, or police precinct, can contain multiple reporting areas.

- Shooting
  - A binary indicator (dummy variable) that identifies the presence of a firearm discharge during an incident. A value of “0” denotes the absence of a shooting, whereas a value of “1” denotes the presence of a shooting. This variable is used to isolate aggravated assaults with and without shootings in the RTM study.

- Year 
  - Calendar year in which the incident occurred (2023 or 2024).

- Month Number
  - Month of the year in numeric format (1-12).

- Hour
  - The hour of the day in which the incident occurred, in integer format (0-23).

- Day of the Week:
  - Day of the week on which the incident occurred, as recorded by the reporting police agency (Boston Police Department).
  - Values: Sunday, Monday, Tuesday, Wednesday, Thursday, Friday and Saturday.


- Street
  - The reported street name or intersection where the incident occurred. Values include single street names and intersections.

- Latitude
  - The geographic coordinate representing the north-south position of the incident.
  - Format: Decimal Degrees (WGS84)

- Longitude
  - The geographic coordinate representing the east-west position of the incident.
  - Format: Decimal Degrees (WGS84)

- Location:
  - A combined spatial field containing both latitude and longitude coordinates.
  - This field and the individual latitude/longitude fields were used as the source for coordinate transformation into a projected coordinate system using meters in QGIS.
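
Before reprojecting, it can help to confirm that the combined Location field agrees with the separate Latitude and Longitude fields. The Python sketch below is one possible check, not part of the original workflow. It assumes the combined field is text of the form "(latitude, longitude)", which this readme does not state, and the function names are hypothetical.

```python
def parse_location(text):
    """Split an assumed '(lat, long)' string into two floats."""
    lat_text, lon_text = text.strip().strip("()").split(",")
    return float(lat_text), float(lon_text)


def location_matches(row, tolerance=1e-6):
    """Check that Location agrees with the Latitude and Longitude fields."""
    lat, lon = parse_location(row["Location"])
    return (
        abs(lat - float(row["Latitude"])) <= tolerance
        and abs(lon - float(row["Longitude"])) <= tolerance
    )


# Made-up record for illustration; it is not taken from the dataset.
example_row = {
    "Latitude": "42.3297",
    "Longitude": "-71.0854",
    "Location": "(42.3297, -71.0854)",
}
print(location_matches(example_row))
```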

Derived Variables
- The following variables were coded from the original occurred_on_date variable, using logical IF formulas to categorize temporal patterns for the spatial crime analysis in RTMDx.
- Variable removal: The original occurred_on_date combined timestamp was removed from the final dataset and replaced by the “Date” and “Time” variables.
 
- Date
  - The calendar date of the incident.
  - Format: M/D/YY.
  - Coded from the original occurred_on_date variable, using a logical IF formula in Microsoft Excel.

- Time
  - The specific time of the incident.
  - Format: HH:MM:SS (24-h clock).
  - Coded from the original occurred_on_date variable, using a logical IF formula in Microsoft Excel.

- Month Text
  - Month name assigned based on the incident date.
  - Coded from the month number variable, using a logical IF formula in Microsoft Excel.
  - Values: January through December.

- Season
  - The seasonal category assigned based on the incident month.
  - Values: Winter (December-February); Spring (March-May); Summer (June-August); Fall (September-November).
  - Logic: Derived from the “Date” variable using a logical IF formula in Microsoft Excel.

- Quarter
  - The specific quarter of the year assigned to the incident.
  - Values: Q1 (January-March); Q2 (April-June); Q3 (July-September); Q4 (October-December).


- Type of Day
  - Categorization of the incident date as weekday or weekend.
  - Values: Weekday (Monday-Friday); Weekend (Saturday-Sunday).
  - Logic: Derived from the “Date” variable using a logical IF formula in Microsoft Excel.
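
The derived variables above were created with IF formulas in Microsoft Excel. For readers who want to reproduce the same categories elsewhere, the Python sketch below shows one way to apply the rules listed in this section. It is an illustration, not the original formulas. The timestamp format "YYYY-MM-DD HH:MM:SS" for occurred_on_date is an assumption, and the function name is hypothetical.

```python
from datetime import datetime

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def season_for_month(month):
    """Map a month number (1-12) to the season categories in this readme."""
    if month in (12, 1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Spring"
    if month in (6, 7, 8):
        return "Summer"
    return "Fall"


def derive_temporal_fields(timestamp_text, fmt="%Y-%m-%d %H:%M:%S"):
    """Derive the six temporal variables from one combined timestamp."""
    moment = datetime.strptime(timestamp_text, fmt)
    return {
        # M/D/YY without leading zeros on month and day.
        "Date": f"{moment.month}/{moment.day}/{moment.year % 100:02d}",
        # HH:MM:SS on a 24-hour clock.
        "Time": moment.strftime("%H:%M:%S"),
        "Month Text": MONTH_NAMES[moment.month - 1],
        "Season": season_for_month(moment.month),
        # Months 1-3 map to Q1, 4-6 to Q2, 7-9 to Q3, 10-12 to Q4.
        "Quarter": f"Q{(moment.month - 1) // 3 + 1}",
        # weekday() returns 0 for Monday through 6 for Sunday.
        "Type of Day": "Weekend" if moment.weekday() >= 5 else "Weekday",
    }


# Made-up timestamp for illustration; it is not taken from the dataset.
fields = derive_temporal_fields("2024-07-06 23:15:00")
print(fields)
```

For the sample timestamp, a Saturday in July 2024, the sketch yields Date 7/6/24, Time 23:15:00, Month Text July, Season Summer, Quarter Q3 and Type of Day Weekend.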



4) Missing data codes: <list code/symbol and definition>
- Specialized formats or other abbreviations used: None
